Conceptual Code Mining: Mining for Source-Code Regularities with Formal Concept Analysis

Authors

  • Kim Mens
  • Tom Tourwé
Abstract

Understanding the conceptual structure of large software systems, whether it is for software understanding or reengineering purposes, is a nontrivial task. In particular, knowing where to start the comprehension process is more difficult than it seems, especially when a system is large and complex and time is scarce. We propose an approach to mine a system’s source code automatically and efficiently for relevant concepts of interest, which we refer to as source-code regularities: what concerns are addressed in the code, what patterns, programming idioms and conventions have been adopted, and where and how they are implemented. We use formal concept analysis to do the actual source-code mining, and then filter, classify and combine the results to present them in a format that is more convenient to a software engineer. We applied a prototype tool that implements this approach to several small to medium-sized Smalltalk applications. For each of these, the tool discovered several interesting source-code regularities. Although the tool and approach can still be improved in many ways, the tool does already provide useful results when having a first contact with a system. The obtained results also illustrate the relevance and feasibility of using formal concept analysis as a technique for source code mining.
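
The mining step described in the abstract can be illustrated with a minimal sketch of formal concept analysis in Python. This is not the authors' tool: the choice of objects (method names), the attributes (name fragments), the sample data, and the filter threshold are all assumptions made for illustration. A formal concept pairs a maximal set of objects (the extent) with the maximal set of attributes they all share (the intent).

```python
from itertools import chain


def formal_concepts(context):
    """Return all (extent, intent) pairs of a formal context.

    `context` maps each object to the set of attributes it has.
    """
    all_attributes = frozenset(chain.from_iterable(context.values()))
    # Every intent is an intersection of object attribute sets; the
    # intersection over no objects is the full attribute set.
    intents = {all_attributes}
    for attributes in context.values():
        intents |= {intent & attributes for intent in intents}
    concepts = []
    for intent in intents:
        # The extent is every object that has all attributes of the intent.
        extent = frozenset(o for o, attrs in context.items() if intent <= attrs)
        concepts.append((extent, intent))
    return concepts


# Hypothetical context: methods described by fragments of their names.
context = {
    "Stack>>push:": {"stack", "push"},
    "Stack>>pop": {"stack", "pop"},
    "Queue>>push:": {"queue", "push"},
}

concepts = formal_concepts(context)

# Illustrative filter (an assumption): keep concepts that group at least
# two objects around at least one shared attribute.
candidates = [(e, i) for e, i in concepts if len(e) >= 2 and i]

for extent, intent in candidates:
    print(sorted(intent), "->", sorted(extent))
```

In this toy context the filter keeps two groupings, one around the fragment "stack" and one around "push". This brute-force enumeration is only practical for small contexts; the number of concepts can grow exponentially with the number of objects.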

Related resources

Mining Source Code for Design Regularities

The aim of this working session on Industrial Realities of Program Comprehension is to exchange and discuss experiences, opportunities, challenges and strategies for the application of program comprehension techniques in industry. In this position paper we focus on a potentially interesting opportunity and challenge for adopting program comprehension techniques, and source code mining technique...

Delving source code with formal concept analysis

Getting an initial understanding of the structure of a software system, whether it is for software maintenance, evolution or reengineering purposes, is a nontrivial task. We propose a lightweight approach to delve a system’s source code automatically and efficiently for relevant concepts of interest: what concerns are addressed in the code, what patterns, coding idioms and conventions have been...

Navigation Spaces for the Conceptual Analysis of Software Structure

Information technology of today is often concerned with information that is not only large in quantity but also complex in structure. Understanding this structure is important in many domains – many quantitative approaches such as data mining have been proposed to address this issue. This paper presents a conceptual approach based on Formal Concept Analysis. Using software source code as an exa...

Conceptual Modeling with Formal Concept Analysis on Natural Language Texts

The paper presents a conceptual modelling technique for natural language texts. This technique combines two conceptual modelling paradigms: conceptual graphs and Formal Concept Analysis. Conceptual graphs serve as semantic models of text sentences and as the data source for the concept lattice, the basic conceptual model in Formal Concept Analysis. With the use of conceptual graphs the Text ...

Comparison and evaluation of source code

Program source code is substantially structured and contains semantically rich programming constructs such as variables, functions, data structures, and program structures which indicate patterns. Mining source code by using different data mining techniques to extract the valuable hidden patterns is the new revolution in software engineering. Over the last decade many tools and techniques hav...

Publication date: 2004